This report gives a brief summary of the textual analysis of the submissions to the Visitor Visa Survey by the Select Committee for Petitions.

Summary of Key Points

Frequency Analysis

The comments averaged 42 words each and had an average Flesch readability score of 48, suggesting that readers would need to be educated to at least a UK Grade Level of 11 to understand them. As this engagement activity was in survey format, this suggests that the people making submissions were educated to at least GCSE level.
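As a rough illustration of how a readability score like the one above can be derived, the sketch below applies the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). It is an assumption that this is the variant used for this report; the syllable counter is a crude vowel-group heuristic, and the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent (e.g. "time"), except in "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores are easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def score_text(text: str) -> float:
    """Score a piece of text using simple regex-based tokenisation."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return flesch_reading_ease(len(words), len(sentences), syllables)
```

Averaging `score_text` over all comments would give a single figure for the dataset. Dedicated readability packages use more careful syllable counting, so their scores will differ somewhat from this heuristic.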

The most common adjectives, phrases and pairs of words are displayed below. People tend to express their emotions through the adjectives they use, and in this case the frequent use of “expensive”, “financial”, and “long” relates to the time and financial commitment of the application process. The key words also raise “health insurance” and “mental health” as other important aspects of the process.

A network of the most frequent consecutive word pairs (bigrams) is shown below. “visit visa”, “application process”, and “home office” are the most common word pairs in the dataset. Phrases such as “time consuming”, “bank statements”, and “immigration rules” are also common, and suggest pockets of comments that raise these specific issues separately from the more general comments. “Children” and “grandparents” are also closely connected to the main cluster of phrases, suggesting these family members are especially affected during the visa application process.
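The bigram counts behind a network like this can be sketched as follows. This is a minimal illustration, not the pipeline used for the report: the stop-word list is a hypothetical, abbreviated one, and the sample comments are invented for the example.

```python
import re
from collections import Counter

# Hypothetical, abbreviated stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "was", "is", "for", "in"}


def count_bigrams(comments):
    """Count consecutive word pairs, skipping pairs that contain a stop word."""
    counts = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z0-9']+", comment.lower())
        for first, second in zip(tokens, tokens[1:]):
            if first not in STOP_WORDS and second not in STOP_WORDS:
                counts[(first, second)] += 1
    return counts


# Invented sample comments, not taken from the survey.
comments = [
    "The application process was time consuming",
    "The Home Office application process is expensive",
    "Time consuming and expensive application process",
]
top = count_bigrams(comments).most_common(2)
```

Pairs are formed before stop words are filtered, so only words that were genuinely adjacent in a comment are counted together. Each counted pair can then become an edge in the network, weighted by its frequency.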


Topic Extraction

A plot of the words most associated with each of the 8 topics is shown below. Most of the topics are around the same area of…. However, two topics stand out. Topic 8 is primarily about the technicalities of the application process, with phrases such as “home office”, “6 months”, “application process”, and “long tedious”. Topic 6 is primarily about the length of the application process and the costs associated with it, for example “paper work”, “took long”, and “long expensive”.

Topic model visualisations are split into two sections:

This visualisation is interactive. Hover over each topic number to view the words in that topic, or select a word to view which topics it appears in.


LDAvis

Sentiment Analysis

The wordcloud below gives the most popular words associated with positive and negative sentiments in the survey. Specific comments which are associated with the most popular sentiments are listed below.

The NRC sentiment lexicon uses a categorical scale to measure 2 sentiments (positive and negative) and 8 emotions (anger, anticipation, disgust, trust, joy, sadness, fear, and surprise). Examples of words and comments in these sentiment categories are below. In this survey, submissions were most often categorised as positive, negative, and anticipation.

Hover over the plot below to read the content of the comments within each sentiment category.

##        anger anticipation      disgust         fear          joy 
##   0.06328951   0.13366847   0.03995875   0.08391338   0.09216293 
##     negative     positive      sadness     surprise        trust 
##   0.14887858   0.15931941   0.11317350   0.04215004   0.12348543
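Category proportions of this kind can be computed by matching each word against a lexicon and normalising the counts so they sum to 1. The sketch below assumes that approach; it uses a small hypothetical lexicon, not the real NRC lexicon, and the category assignments in it are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration; NOT the real NRC lexicon.
TOY_LEXICON = {
    "refusal": {"negative", "sadness"},
    "upset": {"negative", "anger", "sadness"},
    "wait": {"anticipation", "negative"},
    "uncertainty": {"fear", "negative", "anticipation"},
    "birth": {"positive", "joy", "anticipation"},
    "family": {"positive", "trust"},
}


def category_proportions(comments, lexicon):
    """Share of all lexicon matches that fall in each category."""
    counts = Counter()
    for comment in comments:
        for word in re.findall(r"[a-z']+", comment.lower()):
            # A word can belong to several categories; count each one.
            counts.update(lexicon.get(word, ()))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in sorted(counts.items())}


props = category_proportions(["Got refusal. We are really upset", "A long wait"], TOY_LEXICON)
```

Because a single word can carry several categories, the proportions describe the share of matches rather than the share of comments.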

An example of a comment categorised as negative

Got refusal. We are really upset because we lost time; money and feeling as well.

An example of a comment categorised as anticipation

A long wait, uncertainty about the visa meant we couldn’t plan things in advance

An example of a comment categorised as positive

family visit appliaction join us on birth of my new born baby